32 research outputs found

    MultiModN - Multimodal, Multi-Task, Interpretable Modular Networks

    Full text link
    Predicting multiple real-world tasks in a single model often requires a particularly diverse feature space. Multimodal (MM) models aim to extract the synergistic predictive potential of multiple data types to create a shared feature space with aligned semantic meaning across inputs of drastically varying sizes (e.g. images, text, sound). Most current MM architectures fuse these representations in parallel, which not only limits their interpretability but also creates a dependency on modality availability. We present MultiModN, a multimodal, modular network that fuses latent representations in a sequence of any number, combination, or type of modality while providing granular real-time predictive feedback on any number or combination of predictive tasks. MultiModN's composable pipeline is interpretable-by-design, as well as innately multi-task and robust to the fundamental issue of biased missingness. We perform four experiments on several benchmark MM datasets across 10 real-world tasks (predicting medical diagnoses, academic performance, and weather), and show that MultiModN's sequential MM fusion does not compromise performance compared with a baseline of parallel fusion. By simulating the challenging bias of missing not-at-random (MNAR), this work shows that, contrary to MultiModN, parallel fusion baselines erroneously learn MNAR and suffer catastrophic failure when faced with different patterns of MNAR at inference. To the best of our knowledge, this is the first inherently MNAR-resistant approach to MM modeling. In conclusion, MultiModN provides granular insights, robustness, and flexibility without compromising performance. Comment: Accepted as a full paper at NeurIPS 2023 in New Orleans, US.
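
    The sequential fusion described above lends itself to a compact sketch. The following is a minimal, unbatched PyTorch illustration of the pattern, not the authors' implementation: per-modality encoder modules each update a shared state vector, and per-task decoders can read that state out after every step, so predictions are available for whatever subset of modalities happens to arrive. All module names and dimensions are invented for illustration.

        import torch
        import torch.nn as nn

        class ModalityModule(nn.Module):
            """Encodes one modality and updates the shared state vector."""
            def __init__(self, input_dim, state_dim):
                super().__init__()
                self.update = nn.Sequential(
                    nn.Linear(input_dim + state_dim, state_dim),
                    nn.ReLU(),
                )

            def forward(self, state, x):
                return self.update(torch.cat([state, x], dim=-1))

        class SequentialFusion(nn.Module):
            """Fuses any subset of modalities in sequence; absent
            modalities are simply never encoded."""
            def __init__(self, modality_dims, state_dim, n_tasks):
                super().__init__()
                self.state0 = nn.Parameter(torch.zeros(state_dim))
                self.encoders = nn.ModuleDict({
                    name: ModalityModule(dim, state_dim)
                    for name, dim in modality_dims.items()
                })
                self.decoders = nn.ModuleList(
                    [nn.Linear(state_dim, 1) for _ in range(n_tasks)]
                )

            def forward(self, inputs):
                state = self.state0
                feedback = []  # per-step predictions for every task
                for name, x in inputs.items():
                    state = self.encoders[name](state, x)
                    feedback.append([torch.sigmoid(dec(state))
                                     for dec in self.decoders])
                return feedback

        model = SequentialFusion({"image": 512, "text": 128, "labs": 16},
                                 state_dim=32, n_tasks=3)
        # only two of three modalities present: the image module is skipped
        preds = model({"text": torch.randn(128), "labs": torch.randn(16)})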

    Modular Clinical Decision Support Networks (MoDN) - Updatable, interpretable, and portable predictions for evolving clinical environments.

    Get PDF
    Clinical Decision Support Systems (CDSS) have the potential to improve and standardise care with probabilistic guidance. However, many CDSS deploy static, generic rule-based logic, resulting in inequitably distributed accuracy and inconsistent performance in evolving clinical environments. Data-driven models could resolve this issue by updating predictions according to the data collected. However, the size of data required necessitates collaborative learning from analogous CDSSs, which are often imperfectly interoperable (IIO) or unshareable. We propose Modular Clinical Decision Support Networks (MoDN), which allow flexible, privacy-preserving learning across IIO datasets and are robust to the systematic missingness common to CDSS-derived data, while providing interpretable, continuous predictive feedback to the clinician. MoDN is a novel decision tree composed of feature-specific neural network modules that can be combined in any number or combination to make any number or combination of diagnostic predictions, updatable at each step of a consultation. The model is validated on a real-world CDSS-derived dataset comprising 3,192 paediatric outpatients in Tanzania. MoDN significantly outperforms 'monolithic' baseline models (which take all features at once at the end of a consultation), with a mean macro F1 score across all diagnoses of 0.749 vs 0.651 for logistic regression and 0.620 for multilayer perceptron (p < 0.001). To test collaborative learning between IIO datasets, we create subsets with various percentages of feature overlap and port a MoDN model trained on one subset to another. Even with only 60% common features, fine-tuning a MoDN model on the new dataset, or simply making a composite model from MoDN modules, matched the ideal scenario of sharing data in a perfectly interoperable setting. MoDN integrates into consultation logic by providing interpretable, continuous feedback on the predictive potential of each question in a CDSS questionnaire. The modular design allows it to compartmentalise training updates to specific features and to learn collaboratively between IIO datasets without sharing any data.
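
    As a rough illustration of the per-feature modular pattern (a sketch only; the feature and diagnosis names below are invented, not drawn from the Tanzanian dataset), each questionnaire item gets its own small network that updates a shared state, and diagnosis decoders can be applied to that state after every answered question:

        import torch
        import torch.nn as nn

        STATE_DIM = 32

        class FeatureModule(nn.Module):
            """One small network per questionnaire feature (unbatched)."""
            def __init__(self, in_dim):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(in_dim + STATE_DIM, STATE_DIM), nn.Tanh()
                )

            def forward(self, state, answer):
                return self.net(torch.cat([state, answer], dim=-1))

        encoders = nn.ModuleDict({name: FeatureModule(1)
                                  for name in ("fever", "cough", "age")})
        decoders = nn.ModuleDict({dx: nn.Linear(STATE_DIM, 1)
                                  for dx in ("malaria", "pneumonia")})

        def consult(answers):
            """Update the state question by question; unanswered questions
            are skipped, so systematic missingness needs no imputation."""
            state = torch.zeros(STATE_DIM)
            trace = []
            for feature, value in answers.items():
                state = encoders[feature](state, value)
                trace.append({dx: torch.sigmoid(dec(state)).item()
                              for dx, dec in decoders.items()})
            return trace

        # predictive feedback after each of two answered questions
        trace = consult({"fever": torch.tensor([1.0]),
                         "age": torch.tensor([4.0])})

    Under this scheme, porting to an imperfectly interoperable dataset amounts to reusing the trained modules for shared features and training fresh modules only for the novel ones.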

    Systematic review of outcome parameters following treatment of chronic exertional compartment syndrome in the lower leg

    Get PDF
    Objective: Surgery is the gold standard in the management of chronic exertional compartment syndrome (CECS) of the lower extremity, although recent studies have also reported success following gait retraining. Outcome parameters are diverse, and reporting is not standardized. The aim of this systematic review was to analyze the current evidence regarding treatment outcome of CECS in the lower leg. Material and Methods: A literature search and systematic analysis were performed according to the PRISMA criteria. Studies reporting on outcome following treatment of lower leg CECS were included. Results: A total of 68 reports fulfilled the study criteria (n = 3,783; age range 12-70 years; 7:4 male-to-female ratio). Conservative interventions such as gait retraining (n = 2) and botulinum injection (n = 1) decreased mean ICP (from 68 mm Hg to 32 mm Hg) and resulted in a 47% (±42%) rate of satisfaction and a 50% (±45%) rate of return to physical activity. Fasciotomy significantly decreased mean ICP (from 76 mm Hg to 24 mm Hg) and was associated with an 85% (±13%) rate of satisfaction and an 80% (±17%) rate of return to activity. Return to activity was significantly more often achieved (P < .01) in surgically treated patients, except in one study favoring gait retraining in army personnel. Conclusion: Surgical treatment of CECS in the lower leg results in higher rates of satisfaction and return to activity compared to conservative treatment. However, the number of studies is limited and the level of evidence is low. Randomized controlled trials with multiple treatment arms and standardized outcome parameters are needed.

    Beyond spectral gap (extended): The role of the topology in decentralized learning

    No full text
    Extended version of the conference paper of the same name, adding (among other things) theory for the heterogeneous case. (arXiv admin note: substantial text overlap with arXiv:2206.03093.) In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model: more accurate gradients allow them to use larger learning rates and optimize faster. In the decentralized setting, in which workers communicate over a sparse graph, current theory fails to capture important aspects of real-world behavior. First, the 'spectral gap' of the communication graph is not predictive of its empirical performance in (deep) learning. Second, current theory does not explain that collaboration enables larger learning rates than training alone. In fact, it prescribes smaller learning rates, which further decrease as graphs become larger, failing to explain convergence dynamics in infinite graphs. This paper aims to paint an accurate picture of sparsely-connected distributed optimization. We quantify how the graph topology influences convergence in a quadratic toy problem and provide theoretical results for general smooth and (strongly) convex objectives. Our theory matches empirical observations in deep learning, and accurately describes the relative merits of different graph topologies. This paper is an extension of the conference paper by Vogels et al. (2022). Code: https://github.com/epfml/topology-in-decentralized-learning
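
    For concreteness, here is a minimal NumPy sketch (not the repository's code) of the decentralized setting studied here: each worker holds its own copy of the model, takes a local gradient step, and then averages its iterate with its neighbors' through a gossip matrix defined by the graph topology. A ring topology and a toy quadratic objective are assumed purely for illustration.

        import numpy as np

        def ring_mixing_matrix(n):
            """Doubly stochastic gossip matrix for a ring: each worker
            averages with itself and its two neighbors."""
            W = np.zeros((n, n))
            for i in range(n):
                for j in (i - 1, i, i + 1):
                    W[i, j % n] += 1.0 / 3.0
            return W

        def decentralized_sgd(grad, x0, W, lr=0.1, steps=200):
            """One local gradient step, then one gossip round per iteration."""
            X = np.tile(x0, (W.shape[0], 1))    # one model copy per worker
            for _ in range(steps):
                G = np.stack([grad(x) for x in X])
                X = W @ (X - lr * G)            # local step, then neighbor averaging
            return X.mean(axis=0)

        # identical data on all workers: the same quadratic f(x) = 0.5 * ||x||^2
        x_final = decentralized_sgd(lambda x: x, np.ones(4), ring_mixing_matrix(8))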

    Beyond spectral gap: The role of the topology in decentralized learning

    Full text link
    In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model: more accurate gradients allow them to use larger learning rates and optimize faster. We consider the setting in which all workers sample from the same dataset, and communicate over a sparse graph (decentralized). In this setting, current theory fails to capture important aspects of real-world behavior. First, the 'spectral gap' of the communication graph is not predictive of its empirical performance in (deep) learning. Second, current theory does not explain that collaboration enables larger learning rates than training alone. In fact, it prescribes smaller learning rates, which further decrease as graphs become larger, failing to explain convergence in infinite graphs. This paper aims to paint an accurate picture of sparsely-connected distributed optimization when workers share the same data distribution. We quantify how the graph topology influences convergence in a quadratic toy problem and provide theoretical results for general smooth and (strongly) convex objectives. Our theory matches empirical observations in deep learning, and accurately describes the relative merits of different graph topologies. Comment: Under review.
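
    The spectral-gap critique is easy to reproduce numerically. The sketch below (reusing the ring gossip matrix from the sketch above, again an assumption for illustration) computes 1 - |lambda_2| of the mixing matrix, the quantity classical convergence rates depend on; it vanishes as the ring grows, which is why spectral-gap-based theory prescribes ever smaller learning rates on large graphs even where training still behaves well empirically.

        import numpy as np

        def ring_mixing_matrix(n):
            W = np.zeros((n, n))
            for i in range(n):
                for j in (i - 1, i, i + 1):
                    W[i, j % n] += 1.0 / 3.0
            return W

        def spectral_gap(W):
            """1 - |lambda_2| of the mixing matrix: the classical measure
            of how quickly gossip averaging mixes over the graph."""
            lam = np.sort(np.abs(np.linalg.eigvals(W)))[::-1]
            return 1.0 - lam[1]

        for n in (8, 64, 512):
            print(n, spectral_gap(ring_mixing_matrix(n)))  # gap -> 0 as n grows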